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- - - Digital applications have developed rapidly over the last few decades. Since many sources of information 
OO ■ 

O ' are of analog or continuous-time nature, discrete-time signal processing (DSP) inherently relies on sampling a 

CN continuous-time signal to obtain a discrete-time representation. Consequently, sampling theories lie at the heart of 
O 

signal processing devices and communication systems. Examples include sampling rate conversion for software 



radio (H and between audio formats lH, biomedical imaging |3], lens distortion correction and the formation of 
image mosaics HI, and super-resolution of image sequences ||5l . 
^ 1 ■ To accommodate high operating rates while retaining low computational cost, efficient analog-to-digital (ADC) 
and digital-to-analog (DAC) converters must be developed. Many of the limitations encountered in current con- 
verters is due to a traditional assumption that the sampling stage needs to acquire the data at the Shannon-Nyquist 
I rate, corresponding to twice the signal bandwidth 0, Q, HI. To avoid aliasing, a sharp low -pass filter (LPF) must 
be implemented prior to sampling. The reconstructed signal is also a bandlimited function, generated by integer 

o 

m 



shifts of the sine interpolation kernel. 

A major drawback of this paradigm is that many natural signals are better represented in alternative bases 

other than the Fourier basis ||9l, ifTOl . ifTTl . or possess further structure in the Fourier domain. In addition, ideal 
OO . 

O ■ pointwise sampling, as assumed by the Shannon theorem, cannot be implemented. More practical ADCs introduce 
.!_H ' a distortion which should be accounted for in the reconstruction process. Finally, implementing the infinite sine 
5_j interpolating kernel is difficult, since it has slow decay. In practice, much simpler kernels are used such as linear 



interpolation. Therefore there is a need to develop a general sampling theory that will accommodate an extended 
class of signals beyond bandlimited functions, and will account for the nonideal nature of the sampling and 
reconstruction processes. 

Sampling theory has benefited from a surge of research in recent years, due in part to the intense research in 
wavelet theory and the connections made between the two fields. In this survey we present several extensions of 
the Shannon theorem, that have been developed primarily in the past two decades, which treat a wide class of input 
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signals as well as nonideal sampling and nonlinear distortions. This framework is based on viewing sampling in a 
broader sense of projection onto appropriate subspaces, and then choosing the subspaces to yield interesting new 
possibilities. For example, our results can be used to uniformly sample non-bandUmited signals, and to perfectly 
compensate for nonlinear effects. 

Our focus here is on shift-invariant (SI) settings in which both sampling and reconstruction are obtained 
by filtering operations, and the sampling grid is uniform. However, all the results herein can be extended to 
arbitrary Hilbert space settings 1121 . fT3l . |[T4l . |[T5l including finite-dimensional spaces, spaces that are not SI 
and nonuniform sampling. Our exposition is based on a Hilbert-space interpretation of sampling techniques, and 
relies on the concepts of bases and projections. This perspective has been motivated in the context of sampling 
in the excellent review by Unser ifTTI . Here we consider a similar setting and complement the paper of Unser by 
surveying further progress made in this area in recent years. 

We begin by presenting a broad class of sampling theorems for signals confined to an arbitrary subspace in 
the presence of non-ideal sampling, and nonlinear distortions. Surprisingly, many types of nonlinearities that are 
encountered in practice do not pose any technical difficulty and can be completely compensated for. Next, we 
develop minimax recovery techniques that best approximate an arbitrary smooth input signal. These methods can 
also be used to reconstruct a signal using a given interpolation kernel that is easy to implement, with only a 
minor loss in signal quality. To further enhance the quality of the interpolated signal, we discuss fine grid recovery 
techniques in which the system rate is increased during reconstruction. As we show, the algorithms we develop 
can all be extended quite naturally to the recovery of random signals. These additional aspects extend the existing 
sampling framework and incorporate more realistic sampling and interpolation models. 

Before proceeding with the detailed development, we note that an additional topic in the context of sampling that 
has received growing attention recently is that of reconstructing signals that are known to be sparse in some domain. 
This class of problems underlies the emerging field of compressed sensing ||T6l . ITTll . However, this framework has 
focused primarily on sampling of discrete signals and reconstruction techniques from a finite number of samples, 
while our interest here is on sampling and reconstructing analog continuous-time signals from uniform samples. 
Some exceptions are the work in ITSll . |[T9l . |[20l . |[2TI which describe examples of compressed sensing for analog 
signals, and the work on finite-rate of innovation 1221 . 1231 . In the last section, we very briefly touch on this 
important area. 

I. Sampling and Reconstruction Setup 

The Shannon sampling theorem (also attributed to Nyquist, Whittaker and Kotelnikov) states that a signal x{t) 
bandlimited to vr/T can be recovered from its uniform samples at time instants nT. Reconstruction is obtained by 
filtering the samples with a sine interpolation kernel: 
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where sinc(t) = sin(7rt)/(7rt). Although widely used, this theorem relies on three fundamental assumptions that 
are rarely met in practice. First, natural signals are almost never truly bandlimited. Second, the sampling device 
is usually not ideal, that is, it does not produce the exact signal values at the sampling locations. A common 
situation is that the ADC integrates the signal, usually over small neighborhoods surrounding the sampling points. 
Moreover, nonlinear distortions are often introduced during the sampling process. Finally, the use of the sine kernel 
for reconstruction is often impractical due to its very slow decay. 

To design interpolation methods that are adapted to practical scenarios, there are several issues that need to be 
properly addressed: 

1) The sampling mechanism should be adequately modeled; 

2) Relevant prior knowledge about the class of input signals should be taken into account; 

3) Limitations should be imposed on the reconstruction algorithm in order to ensure robust and efficient recovery. 
In this review we treat each of these three essential components of the sampling scheme. We focus on several models 
for each of the ingredients, which commonly arise in signal processing, image processing and communication 
systems. The setups we consider are summarized in Table Jl and are detailed in the ensuing subsections. Table |ll] 
indicates the design objective used in each scenario. As we discuss further below, different priors dictate distinct 
objectives. For example, when the only information we have about the signal is that it is smooth, then the error 
cannot be minimized uniformly over all signals, and alternative design strategies are needed. 

A. Sampling Process 

1) Linear Distortion: In the Shannon sampling theorem, x{t) is bandlimited to vr/T and thus an equivalent 
strategy is to first filter the signal with a LPF with cut-off vr/T and then uniformly sample the output. This 
interpretation is depicted in Fig. [J with s{—t) = sinc(i/T) being the impulse response of the LPF. The samples 
c[n] can be expressed as 

c[n]= x{t)s{t-nT)dt = {x{t),s{t-nT)), (1) 

J t=~oo 

where {ii{t),h{t)) denotes the L2(M) inner product between two finite-energy continuous-time real signals. For 
simplicity, throughout the paper we assume a sampling interval of T = 1. 
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Fig. 1: Shift-invariant sampling. Filtering the signal x{t) prior to taking ideal and uniform samples, can be 
interpreted as L2 inner-products between x{t) and shifts of s{t). Shannon's framework corresponds to the choice 
s{-t) = sinc(t/r). 
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Fig. 2: Nonlinear and shift-invariant sampling. The signal amplitudes x{t) are distorted by the nonlinear mapping 
M prior to shift-invariant sampling. 



In practical applications the sampling is not ideal. Therefore, a more realistic setting is to let s{t) be an arbitrary 
sampling function. This allows to incorporate imperfections in the ideal sampler into the function s{t) |[24l . lfT4l . 
II25I . |[T2l . As an example, typical ADCs average the signal over a small interval rather than outputting pointwise 
signal values. This distortion can be taken into account by modifying s{t) to include the integration. 

2) Nonlinear Distortion: A more complicated situation arises when the sampling process includes nonlinear 
distortions. One simple approach to model nonlinearities is to assume that the signal is distorted by a memoryless, 
nonlinear, invertible mapping prior to sampling by s{—t), as in Fig. |2] This rather straightforward model is general 
enough to capture many systems of practical interest. Nonlinearities appear in a variety of setups and applications of 
digital signal processing including power electronics |26|, radiometric photography |27|, and CCD image sensors. 
In some cases, nonUnearity is insinuated deliberately in order to increase the possible dynamic range of the signal 
while avoiding amplitude clipping, or damage to the ADC ||28l . 

B. Signal Priors 

In essence, the Shannon sampling theorem states that if x{t) is known a priori to lie in the space of bandlimited 
signals, then it can be perfectly recovered from uniformly-spaced ideal samples. Clearly, the question of whether 
x{t) can be recovered from its samples depends on the prior knowledge we have on the class of input signals. 
In this review we depart from the traditional bandlimited assumption and discuss signal priors that appear more 
frequently in signal processing and communication scenarios. 

1) Subspace Priors: Our first focus is on signal spaces that are SI. A SI subspace A oi L2, is the space of 
signals that can be expressed as linear combinations of shifts of a generator a{t) |[29]| : 

00 

x{t) = ^ b[n]a{t - n), (2) 

n=— 00 

where b[n] is an arbitrary norm-bounded sequence. Note that b[n] does not necessarily correspond to samples of 
the signal, that is x{n) ^ b\n\ in general. Choosing a(f) = sinc(t) results in the space of vr-bandlimited signals. 
However, a much broader class of signal spaces can be defined including spline functions ifTTI . In these cases a(t) 
can be easier to handle numerically than the sine function. 
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A spline f{t) of degree is a piecewise polynomial with the pieces combined at knots, such that the function 
is continuously differentiable — 1 times. It can be shown that any spline of degree A^ with knots at the integers 
can be generated using (O by a i?-spline a{t) of degree A^, which is the function obtained by the (A^ + l)-fold 
convolution of the unit square 



Signals of the type Q are also encountered when the analog signal to be sampled originated from a digital 
source. For example, in communication systems, signals of this form are produced by pulse amplitude modulation. 
Extensive research in this field has been devoted to design receivers that undo the effect of inter-symbol interference, 
caused by overlap of the pulses a{t — n). Here we provide a geometric interpretation of this problem, which leads 
to insight into which classes of signals can be perfectly recovered from their samples. This viewpoint also allows 
to incorporate various constraints on the reconstruction method. 

Although the discussion in this article is limited to SI subspaces, the results we present are valid in more 
general subspaces as well |[T2l . |[T3l . In particular, the results can be extended straightforwardly to SI subspaces 
with multiple generators |[30l . jSTl . 1211 . In this case, the filters figuring in the sampling and reconstruction are 
replaced by a bank of filters, and the digital correction is replaced by a multichannel correction system. This allows 
to treat, for example, signals whose spectrum is contained in several frequency bands. 

2) Smoothness Priors: Subspace priors are very useful because, as we will see, they often can be utilized to 
perfectly recover x{t) from its samples. However, in many practical scenarios our knowledge about the signal is 
much less complete and can only be formulated in very general terms. An assumption prevalent in image and 
signal processing is that natural signals are smooth in some sense. Here we focus on approaches that quantify the 
extent of smoothness using the L2 norm where = (/(t), /(t)), and L is usually chosen as some 
differential operator. The appeal of these models stems from the fact that they lead to linear recovery procedures. 
In contrast, smoothness measures such as total variation, result in nonlinear interpolation techniques. 

The class of "smooth" signals is much richer than its subspace counterpart. Consequently, it is often impossible 
to distinguish between one "smooth" signal and another based solely on their samples. In other words, the sampling 
process inevitably entails information loss. Since perfect recovery cannot be attained in this scenario, we focus on 
two alternative criteria: consistency (or least-squares) and a worst case (minimax) design. 

3) Stochastic Priors: The last family we consider in detail is the family of stochastic priors. In this category, the 
signal x{t) is modeled as a wide-sense stationary (WSS) random process with known power spectral density (PSD), 
a viewpoint prominent in the field of statistical signal processing. As a design criterion, we focus on minimization 
of the mean-squared error (MSE) given the signal samples. The theory of sampling random signals has developed 
in parallel lines to its deterministic counterpart |8]. Interestingly, the stochastic setting leads to reconstruction 
techniques that are very similar to the methods arising from the smoothness priors. This provides an interesting 
equivalence between the smoothness operator L and the PSD of x{t) in the random setup. Furthermore, we show 
that the study of statistical priors also sheds some light on the origin of artifacts, which are commonly encountered 
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Fig. 3: Reconstruction using a digital compensation filter h[n] and interpolation kernel w{t). 

in traditional interpolation methods. 

4) Sparsity Priors: In the last section we very briefly touch on sparsity priors. This class of functions lead to 
nonlinear reconstruction algorithms that have a very different structure than the linear interpolation methods in the 
majority of this paper. Since the treatment of these priors differs substantially from the rest of the review, we only 
point to several basic recovery techniques and results in this emerging area. A more detailed discussion merits a 
separate paper. 



C. Reconstruction Methods 

For a sampling theorem to be practical, it must take into account constraints that are imposed on the interpolation 
method. One aspect of the Shannon sampling theorem, which renders it unrealizable, is the use of the sine 
interpolation kernel. Due to its slow decay, the evaluation of x{t) at a certain time instant to, requires using a large 
number of samples located far away from to- In many applications, reduction of computational load is achieved 
by employing much simpler methods, such as linear interpolation. In these cases the sampling scheme should be 
modified to compensate for the chosen non-ideal kernel. 

1) Unconstrained Reconstruction: The first setup we consider is unconstrained recovery. Here, we design 
interpolation methods that are best adapted to the underlying signal prior according to the objectives in Table JIJ 
without restricting the reconstruction mechanism. In these scenarios, it is sometimes possible to obtain perfect 
recovery, as in the Shannon sampling theorem. The unconstrained reconstruction methods under the different 
scenarios treated in this paper (besides the case in which there are nonlinear distortions) all have a common 
structure, depicted in Fig. [3] Here w{t) is the impulse response of a continuous-time filter, which serves as the 
interpolation kernel, while h[n] represents a discrete-time filter used to process the samples prior to reconstruction. 
Denoting the output of the discrete-time filter by d[n\, the input to the filter w{t) is a modulated impulse train 
En d[n\5{t — n). The filter's output is given by 

oo 

x{t) = ^ d[n]w{t - n). (4) 

n=— oo 

Optimal interpolation kernels resulting from such considerations are typically derived in the frequency domain 
but very often do not admit a closed form in the time domain. This limits the applicability of these recovery 
techniques to situations in which the kernel needs to be calculated only on a discrete set of points. The discrete 
Fourier transform (DFT) can be used in such settings to approximate the desired values. Consequently, these 
methods seem to have been used, for example, in the image processing community only as a means of enlarging 
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an image by an integer factor ll32l . |[33l . More general geometrical transformations, such as rotation, lens distortion 
correction and scaling by an arbitrary factor, are typically not tackled using these techniques. One way to resolve 
this problem is to choose the signal prior so as to yield an efficient interpolation algorithm, as done e.g., in 041 in 
the context of exponential B-splines. Nevertheless, this approach restricts the type of priors that can be handled. 

2) Predefined Kernel: To overcome the difficulties in implementing the unconstrained solutions, we may resort 
to a system that uses a predefined interpolation kernel that is easy to implement. In this setup, the only freedom 
is in the design of the digital correction filter h[n\ in Fig. [3l which may be used to compensate for the non-ideal 
behavior of the pre-specified kernel w{t) ll24l . |[T2l . lITSl . |[T3l . |[T4l . The filter h[n] is selected to optimize a 
criterion matched to the signal prior. 

By restricting the reconstruction to the form we are essentially imposing that the recovered signal x{t) lie 
in the SI space generated by the pre-specified kernel w{t). The class of SI spaces is very general and includes 
many signal spaces that lead to highly efficient interpolation methods. For example, by appropriate choice of w{t) 
the family of splines can be described using B-splines have been used for interpolation in the mathematical 
literature since the pioneering work of Schonberg ||35l . In signal processing applications the use of B-splines 
gained popularity due to the work of Unser et. al. that showed how B-spline interpolation can be implemented 
efficiently |[36l . |[37l . Interpolation using splines of degree up to 3 is very common in image processing, due to 
their ability to efficiently represent smooth signals and the relatively low computational complexity needed for 
their evaluation at arbitrary locations. 

3 ) Fine Grid Interpolation: Constraining the interpolation kernel may lead to severe degradation of the recon- 
struction. This emphasizes the fundamental tradeoff between performance and implementation considerations. A 
common way to improve the recovery properties of a reconstruction algorithm is to first upsample the digital signal 
and then apply some simple interpolation method on the resulting finer grid. This is a widely practiced approach 
for sampling rate conversion, where usually a rectangular or triangular interpolation filter is used |[38l . 

Under mild conditions on the interpolation kernel, this approach allows to approximate the optimal unconstrained 
solution arbitrarily well by using a fine enough grid. This, of course, comes at the cost of computational complexity. 
In practice, it is not the asymptotic behavior that interests us, but rather optimizing the performance for a fixed 
setup. Thus, given a fixed oversampling factor K > 1 and an interpolation filter w{t), we would like to design a 
multirate system that processes the samples c[n] and produces fine-grid expansion coefficients d[n] such that the 
reconstruction 

oo 

x{t) = ^ d[n]w (t - ^) (5) 

n=— oo 

well approximates x{t). This setup is depicted in Fig. HI Besides extending the discussion to general interpolation 
filters, we also relax the standard assumption that the input signal is bandlimited. Instead, we design a correction 
system that is best adapted to the prior we have on the input signal. 

The interpolation methods corresponding to the different scenarios discussed above are summarized in Table Jill 
The numbers in the table indicate the equation numbers containing the reconstruction formulas. Interestingly, 




Fig. 4: Fine grid reconstruction using an upsampler followed by a digital compensation filter h[n\ and interpolation 
kernel w{t). The rate of the sequence d[n] is K times larger than that of c[n]. 

TABLE III: Methods for signal recovery 
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we will see that the solutions share a similar structure. Throughout the article, we emphasize commonalities 
and equivalence between the different approaches in order to help design the most appropriate filter for a given 
application. We provide a number of different routes (and formulations) that in many cases lead to the same 
computational solution, while providing several complementary insights into the problem as well as on the notion 
of optimality. 

II. Subspace Priors 

We begin by treating the setting in which the input signal x{t) is known to lie in a given SI subspace. We 
show that when the reconstruction method is not restricted, these priors allow for perfect recovery of x{t) from 
its nonideal samples both in the linear setting of Fig. [T] as well as in the presence of nonlinear distortions as in 
Fig. El Specifically, for any sampling function s{t) there are a broad class of subspace priors under which x{t) 
can be perfectly reconstructed. Conversely, for any given class of functions there are many choices of s{t) that 
will allow for perfect recovery. These filters only have to satisfy a rather mild requirement. The surprising fact is 
that these results are valid even when a memoryless, invertible nonlinearity is inserted prior to sampling, as long 
as the nonlinearity does not vary too fast. 

In the second part of this section, we extend the discussion to constrained reconstruction scenarios. In these 
cases perfect recovery is often impossible, as the restriction narrows down the set of candidate signals which the 
system can output. However, we will show that it is often possible to produce a reconstruction that minimizes the 
squared-norm of the error. 

Throughout this section x{t) is assumed to lie in a SI subspace A generated by a{t) (see (El))- In order for A to 
be well defined and the corresponding sampling theorems to be stable, the functions {a{t — n)} should generate a 
Riesz basis or a frame |10|. To simplify the exposition we focus throughout on the case in which these functions 
are linearly independent and therefore form a basis. However, all the results extend easily to the case in which 
they are linearly dependent. In essence, a Riesz basis is a set of linearly independent vectors that ensures stable 



expansions, namely a small modification of the expansion coefficients results in a small distortion of the signal 
(see Box [A]). In order for a{t) to generate a Riesz basis the continuous-time Fourier transform (CTFT) of a{t) 
must satisfy 

oo 

a< ^ \A{uj - 27rk)\'^ < /3 a.e. (6) 

fc=— oo 

for some constants a > and /3 < oo ||39|. The term in the middle of ^ is the discrete-time Fourier transform 
(DTFT) of the sampled correlation function r^aW = {a{t),a{t — n)). More details on the CTFT and DTFT are 
given in Box|Bj In particular, the functions {a{t — n)} form an orthonormal basis if and only if a = /3 = 1 in 

A. Unconstrained Reconstruction with Linear Sampling 

In the setup of Fig. [T] the input signal x{t) is sampled by a set of sampling functions {s{t — n)}. We denote by 
S the space spanned by these sampUng functions: any f{t) in S is of the form 

oo 

f{t)= d[n]s{t-n) (7) 

n=— oo 

for some bounded-norm sequence d[n]. We assume throughout that s{t) satisfies the Riesz basis condition 

In order to understand what class of signals can be reconstructed from these samples we first observe that 
knowing the samples c[n] of ([T]) is equivalent to knowing the orthogonal projection of x{t) onto S, which we 
denote by xs{t) = Psx{t) (see Box|Cll. Indeed, 

c[n] = {x{t),s{t-n)) = {x{t),Pss{t-n)) = {Psx{t), s{t - n)), (8) 

where we used the fact that Pss{t — n) = s{t — n) and Pg is Hermitian. Since the functions s{t — n) span S, and 
xs lies in S, it is clear that xs can be reconstructed from the samples c[n]. An immediate consequence is that if 
x{t) lies in S so that x{t) = xs{t), then it can be perfectly recovered. 

This geometric interpretation implies that the question of reconstruction from c[n] is equivalent to asking which 
signals can be recovered from knowledge of their orthogonal projection onto S. At first glance it may seem like 
only signals in S may be reconstructed since the projection zeros out any component in 5-*-. However, a closer 
inspection reveals that if we know in advance that x{t) lies in a space A with suitable properties (which we will 
define below), then there is a unique vector in A with the given projection onto S. As depicted in Fig. |5l in this 
case we can draw a vertical line from the projection until we hit the space A and in such a way obtain the unique 
vector in A that is consistent with the given samples. Evidently, perfect recovery is possible for a broad class of 
signals beyond those that lie in S. 

We next discuss how to recover x{t) explicitly using a discrete-time filter as in Fig. [3l We first note that the 
orthogonal projection Psx{t) can be obtained from the samples c[n] by using the scheme in Fig. [3] with w{t) = s{t) 
and h[n] chosen as the impulse response of the filter with DTFT POll . ifTTl 
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Fig. 5: A unique vector in A which is consistent with the samples in S can be recovered from the known samples. 



where S'(lj) is the CTFT of s(t), 

oo 

cj)sA[e^^)A S*{uj -2TTk)A{uj -2TTk), (10) 

fc=— oo 

and (•)* denotes the complex conjugate. Here ^4(0^) is the CTFT of an arbitrary function a{t). The function 
4^8 a{s--^^) is the DTFT of the sampled cross-correlation sequence r^a = 

{s{t),a{t - n)) (See BoxE). Note that 
the Riesz basis condition Q guarantees that ^ is well defined. Efficient implementation of Q, and the filters we 
introduce in the sequel, is possible in spline spaces, based on the results of ifTTl . ||36ll . 1371 . 

To show that the output of the resulting system is Psx{t) note that if x{t) is in S^, then the output will be 
zero since in this case c[n] is the zero sequence. This follows from the fact the inner product of x{t) with any 
signal in S is zero. On the other hand, if x{t) G S, then from (|7]) we can write x{t) = h[n\s{t — n) for some 
sequence h[n\. Using the Fourier relations given in Box|B]it follows that 

oo 

C(e^'^) = S(e^'^) \S{u-27:k)\^ = B{e^^)(t)ss{e^'") ■ (H) 

fc=— oo 

Therefore, d[n] = b[n\ and x{t) = x{t). Consequently, if x{t) lies in S to begin with, then this scheme will ensure 
perfect reconstruction. If in addition s{t) satisfies the partition of unity property, that is s{t — n) = 1 for all 
t, then it can be shown that by selecting the sampling period T sufficiently small, any input signal that is norm 
bounded can be approximated as close as desired by this approach ifTTI . 

The denominator in (O is the DTFT of the sampled correlation function rss[n] = {s{t),s{t — n)). Therefore, 
if the functions {s{t — n)} form an orthonormal basis, then rss[n] = 5[n] and H{e^^) = 1. In this case no 
pre-processing of the samples is necessary prior to reconstruction. This is precisely the setting of the Shannon 
sampling theorem: it is easy to verify that the functions s{t — n) = sinc(t — n) form an orthonormal basis [41] , 

m. 

To extend recovery beyond the space S, suppose that x{t) lies in a known subspace A. Clearly in order to be 
able to reconstruct x{t) from the given samples we need that A and intersect only at zero, since any non-zero 
signal y{t) in the intersection of A and S-^ will yield zero samples and therefore cannot be recovered. Throughout, 
we say that two spaces are disjoint if they intersect only at zero. Intuitively, we also need A and S to have the 
same number of degrees of freedom. These requirements can be made precise by assuming a direct sum condition 
L2 = A® where © denotes a sum of two subspaces that intersect only at the zero vector. This implies that 
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A and S-^ are disjoint, and together span the space of L2 signals. In the SI setting this condition translates into a 



simple requirement on the CTFT of the generators a{t), s{t) of A, S 

|(/>5A(e^")| >«, (12) 

for some constant a > 0, where cjysA (e-''^) is defined by dTOl) . Under this condition, reconstruction can be obtained 
by choosing w{t) = a{t) and IH, Ull, HI, HI, HI 

Hie^"^) = - — ^— . (13) 

When A = S, the filter ^ coincides with ©. 

To see that ([T3] ) ensures perfect recovery for signals in A, note that any x{t) S A can be written as x{t) = 
^„6[n]a(t — n). Using the relations in Box|B]it can be shown that the sequence of samples will have a DTFT 
given by 

C{en = B{en<PsAien, (14) 

from which the result follows. In addition, for any x{t) E S-^ we have immediately that x{t) =0 since c[n\ will 
be the zero sequence. Consequently, the overall system implements an oblique projection E_^s± with range space 
A and null space S-^ P3l . ||39l (see Box0. Indeed, this is the unique operator satisfying Ey^s±x{t) = x{t) for 
any x{t) in A, and E^^±x{t) = for all x{t) in S-^. 

It is also interesting to interpret the proposed sampling scheme as a basis expansion. Since any signal in A 
can be recovered from the corrected samples d[n\ = c[n\ * h[n\ via x{t) = X^n'^N'^C^ ~ ''^)' we may view this 
sequence as the coefficients in a basis expansion. To obtain the corresponding basis we note that by combining 
the effects of the sampler s{t) and the correction filter h[n\ of ([T3l ). the sequence of samples can be equivalently 
expressed as d[n] = {x{t),v{t — n)) where v{t) = h[n\s{t — n). In the Fourier domain, 

y(w) = H{e^'^)S{u:). (15) 

Therefore, we conclude that any x{t) G A can be written as 

00 

x{t)= ^ {x{t),v{t-n))a{t-n). (16) 

n=— 00 

It can be easily verified that the functions {v{t — n)} form a Riesz basis for S, and {v{t — n), a{t — m)) = 5mn 
where 5mn = 1 if m = n and otherwise. Therefore, these functions are the oblique dual basis of {a(t — n)} in S 
lfT4ll . |[T5l . |[T3l . B2I . Il44l . 1311 (see Box [A]). When ^ = 5, we recover the conventional dual basis functions. In 
this case {v{t — n)} forms a basis for S that is dual to the original basis {s{t — n)}: {v{t — n), s{t — m)) = 6nm- 
This provides a concrete method for constructing a dual of a given basis {a{t — n)} in any subspace S satisfying 
the direct sum condition L2 = A(B S-^. 

To conclude our discussion so far, we have seen that a signal x{t) in a SI subspace A generated by a{t), can 
be reconstructed from its generalized samples in Fig. [T] using any choice of s(t) for which (fill) is satisfied. Thus 
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for a given SI space, there is a broad variety of sampling filters we can select from. By choosing the functions 
appropriately, a variety of interesting sampling theorems can be formulated, such as pointwise sampling of non- 
bandlimited signals, bandlimited sampling of nonbandlimited functions, and many more. An example is given 
below. 

Despite the fact that any sampling function s{t) satisfying (fT2l ) can be used to sample x{t) in the space A 
generated by a{t), in the presence of noise out of A, the choice of sampling kernel will effect the reconstructed 
signal. More specifically, we have seen that the output of Fig. [3] with w{t) = a{t) and h[n\ given by (fT3l ) is 
equal to the oblique projection XE{t) = E_^s±x{t). When x{t) G A, we have XE{t) = x{t) for any choice of 
S^, or equivalently any sampling function s{t) in Fig. [3] satisfying (fT2l ). However, if x{t) does not lie entirely 
in A, for example due to noise, then different functions s{t) will result in different approximations XE{t) € A. 
A natural question is: Given an interpolation kernel a{t), which choice of sampling function s{t) will lead to a 
reconstruction x{t) that is closest to x{t)? If we measure the error using the squared-norm \\x{t) — x{t)\\'^, then the 
choice s{t) = a{t) minimizes the error. This follows from the projection theorem which states that the orthogonal 
projection onto A is the closest vector in A to an arbitrary input x{t) [45 1: 



Therefore, since using a kernel a{t) will lead to an interpolation x{t) in A irrespective of h[n\, the smallest error 
will result when x{t) = PAx{t). The orthogonal projection can be achieved only if the sampling function s{t) 
generates A. In addition, in contrast with the orthogonal projection, an oblique projection can increase the norm of 
the noise at the input (see Box 0. In practice, however, we may prefer other choices that are easier to implement 
at the expense of a slight increase in error |[24l . |[T2l . 

We conclude this subsection with a non-intuitive example in which a signal that is not bandlimited is filtered 
with a LPF prior to sampling, and still can be perfectly reconstructed from the resulting samples. 

Consider a signal x{t) formed by exciting an RC circuit with a modulated impulse train — n), as 

shown in Fig. Oa). The impulse response of the RC circuit is known to be a{t) = exp{— t/r}u(t), where 
u{t) is the unit step function and r = RC is the time constant. Therefore 



Clearly, x{t) is not bandlimited. Now, suppose that x{t) is filtered by an ideal LPF s{t) = sinc(t) and then sampled 
at times t = n to obtain the sequence c[n\. The signal x{t) and its samples are depicted in Fig. I3b). Intuitively, 
there seems to be information loss in the sampling process since the entire frequency content of x{t) outside 
[— vr, vr] is zeroed out, as shown in Fig. Oc). However, it is easily verified that condition ([T2l) is satisfied in this 
setup and therefore perfect recovery is possible. The digital correction filter ([T3] ) in this case can be shown to be 



arg mill \\x{t) — = P^x{t). 



(17) 




(18) 




(19) 



13 



signal generator sampling reconstruction 




► x{t) 



(a) Sampling and reconstruction scheme. 




(b) The signal, its samples and expansion coefficients. (c) Frequency content of the signal and sampling filter. 

Fig. 6: A non-bandlimited signal x(t), formed by exciting an RC-circuit with a modulated impulse train, is sampled 
after passing through an ideal LPF and then perfectly reconstructed by re-exciting an identical RC-circuit with 
an impulse train modulated by a digitally filtered version of the samples, (a) Sampling and reconstruction setup, 
(b) The signal x{t) and its samples c[n\. (c) The signal X{lo) and the sampling filter S{u>). Perfect recovery is 
possible despite the fact that a large portion of the frequency content is lost due to the filtering operation. 

Thus, to reconstruct x{t) we need to excite an identical RC circuit with an impulse train modulated by the sequence 
d[n] = h[n\ * c[n\. The entire sampling-reconstruction setup is depicted in Fig. Efa). 

B. Unconstrained Reconstruction with Nonlinear Distortion 

Suppose now that as in the previous section x{t) lies in a subspace A and (fT2l ) is satisfied. However, prior 
to sampling by s{—t) the signal is distorted by a memoryless, nonlinear and invertible mapping M{x) as in 
Fig. |2l A naive approach to recover the signal x{t) from its samples is to first apply M^^ to the sample sequence 
c[n], leading to a sequence d[n\, and then reconstruct x{t) from the samples d[n\ using standard reconstruction 
techniques B6l . However, if the samples c[n\ are not ideal, namely are not pointwise evaluations of x{t), then 
this approach is suboptimal in general. 

A surprising result developed in [47] is that if the nonlinearity is invertible and does not change too fast, then 
it does not introduce theoretical difficulties. More specifically, under the same direct sum condition ([T2l ) we had 
in the linear sampling case, and assuming that the derivative of the nonlinearity is appropriately bounded, there is 
a unique signal x{t) with the given samples c[n\. Therefore, it is enough to seek a recovery x{t) that is consistent 
in the sense that it yields the samples c[n\ after it is reinjected into the system: s{t — n)M {x{t)) dt = c[n]. 



reconstruction sampling 




Fig. 7: One iteration of the nonlinear recovery algorithm. The expansion coefficients at the A;th iteration, dk[n\, are 
used to synthesize an estimate of the signal, Xk{t). This estimate goes through the sampling process to produce 
the corresponding samples Ck[n\. The error with respect to the measured samples, ek[n] = c[k] — Cfc[n], is then 
used to update the estimate. 



Any such signal must be equal to x{t) due to the uniqueness property. This result is important as it allows to 
reformulate the recovery problem in terms of minimizing the error in the samples. 

Since x{t) G A, we can write x{t) = — n) for some sequence d[n]. Thus, our problem reduces to 

finding a sequence d[n] which minimizes the consistency cost function 

fid) = \\c[n]-c[n]l^. (20) 

Here is the ^2-norm of the sequence b[n\, and 



cn 



/oo / oo \ 

s{t - n)M ^ d[m]a{t - m) dt, (21) 
-°° \m=-oo / 



are the estimated samples based on our current guess of x{t). Clearly the minimal value of f{d) is 0. Since M is 
nonlinear, the cost function ( |20l ) is in general non-convex. Therefore optimization algorithms for minimizing (|20l ) 
might trap a stationary point, and not the global minimum which we seek. Surprisingly, it can be shown BTl that 
under the direct sum condition and appropriate bounds on the derivative of M, (l20l ) has a unique stationary point 
which is equal to the global minimum. Therefore, any algorithm designed to trap a stationary point automatically 
leads to perfect recovery. This is despite the fact that the objective (l20l ) is not convex. In Fig. [T] we show a block 
diagram of an iterative approach which is derived by applying a Newton method on (|20l ). This same algorithm 
can also be obtained from an approximate projection onto convex sets strategy, and a linearization approach; see 
BTl for more details. 

At each iteration, the algorithm of Fig. |7] works as follows. Denote by d^ [n] the expansion coefficients at the 
A;th iteration so that Xfc(t) = J2n ^fc ~ Then dfc+i[n] is calculated as 

oo 

dk+i[n] = dk[n] + ak ^ gk[n,m]ek[m], (22) 

m=— oo 

where is the step size, ek[m] = c[m] — Ck[m\ is the error-in-samples sequence with Cfc[n] denoting the 
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(a) 




(b) 



Fig. 8: A signal x{t) lying in a shift-invariant space was linearly sampled after passing through a memoryless 
nonlinear system, (a) Ignoring the nonlinear distortion and filtering the samples c[n\ with the filter H{e^^) of (fT3] ). 
leads to poor reconstruction, (b) The algorithm presented here leads to perfect recovery of x{t). 



approximate samples at stage k obtained via (|2TI) with d[n] = dk[n], and gk[l,m] is a linear system which is 
the inverse of 

hk[l,m]= / s{t-l)M'i ^ dk[n]a{t-n)\a{t-m)dt. (23) 

\n=-oo / 

Here M' denotes the derivative of M. Note that hk[l,m\ is not SI in general and therefore it cannot be inverted 
in the frequency domain to obtain gk[l,m\. In practice, though, one usually analyzes a finite set of samples 
c[n\,0 < n < N — 1. Assuming that c[n\ =0 outside this range, the matrix {gk[l,m\} for < I, m < N — 1 can 
be obtained by inverting the corresponding matrix {hf^[l,m\}. 

We now demonstrate the effectiveness of the algorithm in a scenario similar to that of Fig. [6l Specifically, suppose 
that, as in Fig. [6l x{t) is known to be of the form dTSl ). and we are given the samples c[n] = J^_^ arctan dt. 
This corresponds to using a nonlinear mapping M{x) = arctan(x) and a sampling filter with impulse response 
equal to a rectangular window of length 1. The signal and its samples are depicted in Fig. [Ha). Evidently, the 
samples c[n] constitute a rather poor representation of the signal. Consequently, if one ignores the nonlinearity and 
uses the techniques developed in the previous section, that is filtering with H{e^^) of ([T3l) . then the reconstruction 
error is large (dotted line). In Fig. [8tb) we show the result of applying the algorithm presented here, which leads 
to perfect reconstruction of x{t) from the nonideal samples c[n]. 

C. Constrained Reconstruction 

Up until now we specified the sampling process but did not restrict the reconstruction or interpolation kernel 
w{t) in Fig. [3] In many apphcations this kernel is fixed in advance due to implementation issues. For example, in 
image processing applications kernels with small supports are often used. These include nearest neighbor, bilinear, 
bicubic, Lanczos and splines. The interpolation kernel w{t) can also represent the pixel shape of an image display. 
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Fig. 9: The signal x{t) € A can be recovered from the samples c[n], allowing to compute its orthogonal projection 
onto yV. 

In order to obtain stable reconstruction, we concentrate in the sequel on cases in which w{t) satisfies the Riesz 
basis condition Q. In particular, it can be easily shown that S-splines all satisfy this requirement. 

Given a sampling function s{—t) and a fixed interpolation kernel w{t) an important question is how to design 
the digital filter h[n] in Fig. |3]so that the output x{t) is a good approximation of the input signal x{t) in some 
sense. Clearly, x{t) will always lie in the space W, spanned by the generator w{t). This is because for every 
choice of the sequence d[n\, x{t) has the form x{t) = X]n^[^]^(^ ~ Therefore, if x{t) does not lie in W to 
begin with, then x{t) cannot be equal x{t). Since x{t) is constrained to lie in W, it follows from the projection 
theorem ( fTTl ) that the minimal error approximation to x{t) is obtained when x{t) = Py^x{t). The question is 
whether this solution can be generated from the samples c[n\. In general, the answer is negative without sufficient 
prior knowledge on the signal |[T2l . However, when x{t) lies in a subspace satisfying ([T2l ). Py^)x{t) can be obtained 
by filtering the sample sequence with 



A{uj) and S{uj). In this case, the output of the system of Fig. [3] is given by Py\;E_^s±x{t) |12|. Consequently, if 
x{t) € A, then E_/(g±x{t) = x{t) and the minimal squared-error approximation Py^x{t) is achieved. 

To understand this result geometrically, note that we have already seen in the previous subsection that under the 
direct sum condition, any vector x{t) G A can be recovered from the samples c[n]. This is illustrated in Fig. [5] 
Here, however, we are constrained to obtain a solution in W. But, since we can determine x{t), we can also 
compute P-y\!x{t), which is the minimal squared-error approximation in W. This is shown in Fig. |9] 

D. Dense Grid Recovery 

The situation in which x{t) can be completely determined from its samples but cannot be reproduced by the 
system is somewhat frustrating. Moreover, the error caused by restricting the recovered signal to lie in W may 
be very large if W is substantially different from A. One way to bridge the gap between the unconstrained and 
constrained recovery techniques is to increase the interpolation rate, namely produce a reconstruction of the form 
x{t) = J2=-oo d-iiT-lwit — n/K), for some integer > 1, as depicted in Fig. ID This strategy is legitimate as 




(24) 



where <l)wA (e^"), (t>sA (e^") and <t)ww (e^") 



are as in (fTOl) with the corresponding substitution of the filters W{oj), 



17 



we are still using a predefined interpolation kernel w{t), which may be easy to implement. Thus, we effectively 
introduce a tradeoff between complexity and performance. 

The motivation for this approach can be understood from a geometric viewpoint. As we increase the interpolation 
rate K, the reconstruction space W spanned by the functions {w {t — n/K)} becomes "larger" and consequently 
"closer" to A. In some cases, there exists a factor K for which W contains A, thus recovering the possibility of 
perfect reconstruction. 

In order for the reconstruction to be stable, we focus on the case in which the functions {w {t — n/K)} form 
a Riesz basis. This requirement is satisfied if and only if there exists constants < a < /5 < oo such that 
— Z^i^-oo \^{^ ~ 27r/i^)p < /5 is satisfied almost everywhere. 

Similarly to the setting in which K = 1, it can be shown that when x{t) is in A, the minimal squared error 
solution x{t) = P)/\;x{t) can be attained with the system depicted in Fig.|4l The frequency response of the correction 
filter h[n], which operates on the up-sampled data, is given by 



where Ws{uj) = W{Kuj), As(aj) = A^Kuj), and (psA{e^'^) is defined in ([TOl l. The dense grid recovery scheme of 
Fig. |4]can also be implemented using polyphase filters, which in some cases may lead to a simpler implementation 



Up until now we considered the setting in which the input signal x{t) is constrained to a subspace. We now 
treat a more general and less restrictive formulation of the sampling problem in which our prior knowledge on 
the signal is that it is smooth in some sense. Here we model the extent of smoothness of x{t) as the L2 signal- 
norm at the output of a continuous-time filter L{iu) with x(t) as its input. In practice, L{uj) is often chosen 
to be a first or second order derivative in order to constrain the solution to be smooth and nonoscillating, i.e., 
L{uj) = ao + aija; + a2(iw)^ + • • • for some constants a„. Another common choice is the filter L{uj) = {oq + uj'^)'^ 
with some parameter 7. We denote the output norm as ||Lx(t)j|. For simplicity, we assume that L{uj) > a > 
almost everywhere for some a, although the results extend to the non-invertible case as well. 

Unlike subspace priors, a one-to-one correspondence between smooth signals and their sampled version does not 
exist since smoothness is a far less restrictive constraint than confining the signal to a subspace. Perfect recovery, 
or even error-norm minimization, is therefore impossible. Indeed, it can be shown that there is no single choice of 
x{t) that minimizes \\x{t) — over all smooth signals x{t), even when x{t) is constrained to lie in a subspace 

W. This is because the sample sequence c[n] is no longer sufficient to determine the orthogonal projection P\yx{t) 
|[T2l . Therefore, below, we focus on alternative approaches for designing the reconstruction system. 

A. Unconstrained Reconstruction 

To approximate x{t) from its samples, based solely on the knowledge that it is smooth, we consider two design 
techniques. The first consists of finding the smoothest signal which gives rise to the measured samples c[n] 1,24 J . 




(25) 



in. 



III. Smoothness Priors 
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The second is a minimax strategy in which the system is designed to yield the best approximation for the worst-case 
signal among smooth inputs that are consistent with the samples fT2ll . 

1) Smoothest Approximation: In this approach we require that the reconstructed signal x{t) is smooth and 
consistent with the samples. The consistency requirement means that x{t) should yield the same samples c[n\ 
when reinjected into the system: 

{x{t),s{t - n)) = c[n\ = {x{t), s{t - n)) for all n. (26) 



The simplest strategy to produce a consistent smooth reconstruction is to minimize the smoothness ||La;(t)|| subject 
to the consistency requirement: 

x{t) = argmin ||Lx(t)|| subject to S{x{t)} = c. (27) 

x{t) 

The notation S{x{t)} denotes the sequence of samples (s(t — n),x{t)) and c stands for the sequence {c[n]}. It 
can be shown that the solution to (|27] ) has the form of Fig. |3] where now the reconstruction kernel is 

1^(^)1 



H{en = , \ .^y (29) 



and 



In the previous section, we have seen that the filter ( 1291) corresponds to the choice leading to perfect reconstruction 
for signals x{t) ^ W (see (O)- Thus, this approach can be viewed as first determining the optimal space given 
by (|28] ). and then finding the unique signal in W that is consistent with the given samples. 

As a special case, we may choose to produce the minimal norm consistent reconstruction x{t) by letting L be the 
identity operator I. This leads to w{t) = s{t) and H{e^'^) is then given by Consequently, x{t) is the orthogonal 
projection onto the sampling space, x{t) = Psx{t). This can also be seen by noting that any reconstruction x{t) 
which yields the samples c[n] has the form x{t) = Psx{t) + v{t) where v{t) is an arbitrary vector in S-^. The 
minimal norm approximation corresponds to the choice v{t) = 0. 

2) Minimax Recovery: The reconstruction error ||x(t) — x(t)p of any recovery method depends on the unknown 
original signal x{t). This renders comparison between interpolation methods complicated. Indeed, one algorithm 
may be better than another for certain input signals and worse for others. The next approach we discuss is based 
on optimizing the squared-error performance for the worst input signal. 

The prior information we have can be used to construct a set V of all possible input signals: 

V = {x{t) : S{x{t)} = c, \\Lx{t)\\ < U}, (30) 

were [/ > is some finite constant. The set consists of signals that are consistent with the samples and are 
relatively smooth (with respect to the weighted norm ||Lj;(t)||). We now seek the reconstruction that minimizes 
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(a) Bicubic 



(b) Minimax 



Fig. 10: Mandrill image rescaling: down-sampling by a factor of 3 using a rectangular sampling filter followed by 
upsampling back to the original dimensions using two interpolation methods, (a) The bicubic interpolation kernel 
leads to a blurry reconstruction with PSNR of 24.18dB. (b) The minimax method leads to a sharper reconstruction 
with PSNR of 24.39dB. 



the worst-case error over V: 

min max — (31) 

x{t) x{t)€V 

It can be shown that the optimal solution does not depend on the constant U. Furthermore, the minimax solution 
interestingly coincides with the smoothest approximation method, that is, the optimal interpolation kernel and 
digital compensation filter are given by (1281 ) and ( |29l ) respectively. 

Although the two approaches we discussed are equivalent in the unrestricted setting, the minimax strategy allows 
more flexibility in incorporating constraints on the reconstruction, as we show in the next subsection. Furthermore, 
it tends to outperform the consistency approach when further restrictions are imposed as we will demonstrate via 
several examples. 

Figure \T0\ compares the minimax approach with bicubic interpolation in the context of image enlargement. The 
regularization operator was taken to be L{u:) = ((O.lvr)^ + ||a;|p)^'^, where u: denotes the 2D frequency vector. In 
this example minimax recovery is superior to the commonly used bicubic method in terms of peak signal to noise 
ratio (PSNR), defined as PSNR = 101ogxo(255^/MSE) with MSE denoting the empirical squared-error average 
over all pixel values. In terms of visual quality, the minimax reconstruction is sharper and contains enhanced 
textures. 



B. Constrained Reconstruction 

We next treat the problem of approximating x{t) from its samples c[n] using a pre-specified interpolation kernel 
w{t). Similar to the unrestricted scenario, the two main approaches in this setup are consistent reconstruction 
ll24i . ifTTI . |[T4l . |[T5l . ITBI and minimax recovery |[T2l . P9l . However, here the solutions no longer coincide. These 
methods can both be interpreted in terms of projections onto the spaces W and S that figure in the problem setting. 
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Whereas the first approach leads to an oblique projection, the second strategy involves orthogonal projections, 



rendering this solution more robust to noise jSOll . IdTI . 

1 ) Consistent Reconstruction: In order to incorporate the constraint on the interpolation kernel, we extend (|27l) 
to include the restriction x{t) £ W: 

x{t) = argmin subject to S{x{t)} = c, x{t) G W. (32) 

x{t) 

Recall that under the direct sum condition (fT2l) with W playing the role of A, there is a unique signal in W 
satisfying S{x{t)} = c, which is equal to the oblique projection Eyys±x{t). Since there is only one signal in 
the constraint set of problem ( [32l ). the smoothness measure in the objective does not play a role. The oblique 
projection can be obtained by processing the samples c[n] using the filter 

Hie^"^) = (33) 

Comparing with ([T3] ). we see that this is precisely the filter that yields perfect recovery when we know that 
x{t) € yy. When the direct sum condition is not satisfied, there can be several consistent solutions so that the 
objective in (l32l) is needed in order to select one output among all possibilities ll52l . |[53l . 

2) Minimax Recovery: A drawback of the consistency approach is that the fact that x{t) and x{t) yield the 
same samples does not necessarily imply that x{t) is close to x{t). Indeed, for an input x{t) not in W, the norm 
of the resulting reconstruction error x{t) — x{t) can be made arbitrarily large, if S is close to W^. Furthermore, 
as we have seen, the consistency method essentially ignores the smoothness prior. 

In order to directly control the reconstruction error \\x{t) — x{t)\\'^ , we may modify the minimax strategy of the 
previous subsection to include the restriction x{t) G W. Therefore, our minimax design criterion is now: 

mill max — x(t) |P, (34) 

x(t)GWx{t)GV 



where V is the set of smooth consistent signals given by (1301 ). 

It turns out that the criterion ( [34l ) is too conservative and, for example, in the case in which L is the identity 
operator L = / it results in the trivial solution x{t) =0 |[T2l . To counterbalance the conservative behavior of 
the minimax approach, instead of minimizing the worst-case squared-norm error, we consider minimizing the 
worst-case regret ||54l . The regret is defined as the difference between the squared-norm error and the smallest 
possible error that can be achieved with a reconstruction in >V, namely ||P)4;_Lx(t)|p. This error is attained when 
x{t) = P\^x(t), which in general cannot be computed from the sequence of samples c[n\. Since the regret depends 
in general on x{t), it cannot be minimized for all x{t). Instead we consider minimizing the worst-case regret over 
aU possible signals x{t) that are consistent with the given samples, which results in the problem 

min max \\\^t) - x(t)f - \\Pw±x(t)f\ , (35) 

with V given by (l30l ). Expressing x{t) as x{t) = Pv\)x{t) + Py^±x{t) and recalling that x{t) G W it is easy to see 
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(a) Consistent. (b) Minimax regret. 

Fig. 11: Mandrill image rescaling: down-sampling by a factor of 3 using a rectangular sampling filter followed by 
upsampling back to the original dimensions using a triangular interpolation kernel via the consistent and minimax 
regret methods, (a) The consistent approach over-enhances the high frequencies and results in a PSNR of 22.51dB. 
(b) The minimax regret method leads to a smoother reconstruction with PSNR of 23.69dB. 



that ( [35] ) is equivalent to 



min max \\x(t) — Pw;x(t)\\ . (36) 

x{t)£W x{t)GV 



The solution to ( [36l ) can be shown to be the projection onto W of the unconstrained minimax recovery given 
by (|28]) and ( |29l ). The reconstructed signal x{t) is obtained by digitally filtering the samples c[n] with the filter 

where W{uj) is given by (1281) . 

In Fig. [TT] we demonstrate the difference between the consistent and minimax-regret methods in an image- 
enlargement task. The setup is the same as that of Fig. [TO] only now the reconstruction filter is constrained to be a 
triangular kernel corresponding to linear interpolation. It can be seen that the error of the minimax regret recovery 
is only 0.7dB less than the unconstrained minimax shown in Fig. [TOl The consistent approach, on the other hand, 
is much worse both in terms of PSNR and in terms of visual quality. Its tendency to over-enhance high frequencies 
stems from the fact that it ignores the smoothness prior. 

Many of the interesting properties of the minimax-regret recovery ( [37] ) can be best understood by examining the 
case where our only prior on the signal is that it is norm-bounded, that is, when L is the identity operator L = I. 
This choice of L was thoroughly investigated in |12|. Setting L{uj) = 1 in ([37] ). the correction filter becomes 

Hien = (38) 

(^ss (e^^) (pww (e^'^) 

since from ( [28] ). w{t) = s{t). Applying the Cauchy-Schwartz inequality to the numerator of ( [38] ) and to the 
denominator of ([33]) . it is easy to see that the magnitude of the minimax regret filter ([38]) is smaller than that of 
the consistent filter ([33] ) at all frequencies. This property renders the minimax regret approach more resistant to 
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noise in the samples c[n], since perturbations in x{t) caused by errors in c[n] are always smaller in the minimax 
regret method than in the consistent approach. 

Apart from robustness to digital noise, which takes place after the signal was sampled, the minimax regret 
method is also more resistant to perturbations in the continuous-time signal x{t). To see this note that the minimax 
regret reconstruction is given by x{t) = Py\!Psx{t). Thus, the norm of x{t) is necessarily bounded by that of x{t). 
Furthermore, it is easy to show that the resulting reconstruction error is always bounded by twice the norm of x{t): 
||x(t)— x(t)|p < 2||x(t)|p. In contrast, the consistent recovery is given by the oblique projection x(t) = Ey^s±x{t), 
which may increase the norm of x{t). Consequently, the error of the consistent reconstruction can, in some cases, 
grow without bound. 

In Fig.[T2]we illustrate the minimax regret reconstruction geometrically for the case L = I. We have seen already 
that knowing the samples c[n] is equivalent to knowing xs{t) = Psx{t). In addition, our recovery is constrained 
to lie in the space W. As illustrated in the figure, the minimax regret solution is a robust recovery scheme in 
which the signal is first orthogonally projected onto the sampling space, and then onto the reconstruction space. 

When x{t) is known to lie in S, it follows from the previous section that the minimal error can be obtained by 
using (l24l ) with A{uj) = S{lo). The resulting filter coincides with the minimax regret filter of (|38] ). Consequently, 
the regret approach minimizes the squared-error over all x{t) € S. 

An interesting feature of the minimax regret solution is that it does not depend on the norm bound U. Therefore, 
x{t) = Py\;Psx{t) minimizes the worst-case regret error over all bounded inputs x{t), regardless of the norm of 
x{t). Furthermore, the regret recovery method does not require the direct-sum condition L2 = W © S^, which is 
necessary in the development of the unique consistent approach. 

In lfT2l tight bounds on the error resulting from each of the methods are developed and compared. We omit 
the technical details here and only summarize the main conclusions. We first recall that if we know a priori that 
x{t) lies in a subspace A such that L2 = A® S-^, then the filter (l24l ) will yield the minimal error approximation 
of x{t) and therefore is optimal in the squared-norm sense. When A = S this strategy reduces to the minimax 
regret method, while if ^ = W, then we obtain the consistent reconstruction. When no prior subspace knowledge 
is given, the regret approach is preferable if the spaces S and W are sufficiently far apart, or if x{t) has enough 
energy in S. These results are intuitive as illustrated geometrically in Fig. [12] In Fig. fl^ a) we depict the consistent 
and regret reconstruction when W is far from S. As can be seen in the figure, in this case the error resulting from 
the consistent solution is large with respect to the regret approximation eiTor. In Fig. [Tll b). W and S are close, 
and the errors have roughly the same magnitude. 

C. Minimax Dense Grid Reconstruction 

We now extend the minimax regret approach to the dense-grid recovery setup of Fig. |4l in which the interpolation 
is performed using a predefined kernel w{t) on a grid with 1/K spacings. To treat this scenario within the minimax- 
regret framework, we need to solve ( [34l ) with the appropriate reconstruction space, namely W = span{w(t— n/i^)}. 
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Fig. 12: Comparison of minimax regret reconstruction {Py^Psx{t)) and consistent reconstruction {E-y^s±x{t)) for 
two different choices of W. (a) The minimax strategy is preferable when W is 'far' from S. (b) Both methods 
lead to errors on the same order of magnitude when W is 'close' to S. 

The corresponding correction filter can be shown to be 



where W,{uj) = W{Kuj) and W^{uj) = W{Kuj) with W{uj) of 

To understand the necessity of fine grid interpolation, note that there is no analytic expression for the optimal 
unconstrained kernel ( |28l ) in the time domain. In rate conversion situations, where the output rate is an integer 
multiple of the input rate, the kernel w{t) needs to be calculated only on a discrete set of points. This is because 
x{n/K) = — rn), where K is the oversampling factor. To approximate the sequence {w{n/K)} 

on a finite set of indices, one can sample the expression W{u} — 2ttIK) on a finite set of frequencies and apply 
the inverse DFT. However, if x{t) must be evaluated at arbitrary locations, then this method cannot be used. 

In the previous subsection we have seen that this problem can be tackled by using a predefined interpolation 
kernel for which a formula exists. An alternative approach is to first evaluate the optimal kernel ( |28] ) on a dense 
grid, and then use nearest neighbor or linear interpolation to obtain the values at the desired locations. This is 
referred to as first and second order approximation [38 1. It is easy to show that these approaches correspond to 
using the high-rate scheme of Fig. |4] with the digital correction filter 



to be performed in the second stage. This is in contrast with the dense grid approach presented here, where the 
correction filter ( [39l ) explicitly depends on w{t). Filter ( [39l ) shapes the spectrum of the up-sampled sequence in a 
way that partially compensates for the non-optimal kernel to follow. 

In Fig. [13] we compare the minimax-regret dense-grid reconstruction approach and first-order approximation to 
the unconstrained filter. To emphasize the differences, we used both methods to enlarge an image by an irrational 
factor vr/e. It is clearly seen that the first-order approximation approach produces blurry reconstruction whereas 




(39) 



hap[n\ = g{n/K) 



(40) 



and with a rectangular or triangular interpolation kernel w{t). Here G{lo) = H{e^'^)W{uj) with H{e^'^) of ( [291 ) 
and W{uj) given by ([28l) . Note, therefore, that this method does not take into account the non-optimal interpolation 
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(a) First order approximation of the minimax approach. (b) Dense grid minimax regret. 



Fig. 13: Comparison of first order approximation to the minimax method and dense grid minimax regret. In both 
methods the image is up-sampled by a factor of K = 2 and digitally filtered. Then, linear interpolation (triangular 
kernel) is used to calculate the gray-level values at the desired locations, (a) First order approximation to minimax 
regret (l40t . (b) Dense grid minimax regret ( [39l) . 

in the minimax method the edges are sharp and the textures are better preserved. 

IV. Stochastic Priors 

In this section we treat signal priors of stochastic nature. Specifically, the input x{t) is modeled as a WSS 
random process having PSD A^^(aj). Our goal is to linearly estimate x(t) given the samples c[n\. As we will see, 
the schemes resulting from these considerations have strong connections to the minimax methods of the previous 
section. In addition, this viewpoint also offers a nice explanation to reconstruction artifacts, frequently encountered 
in applications. 

A. Unconstrained Reconstruction 

We first examine constrained-free reconstruction. In the deterministic setting with smoothness prior we could 
not minimize the squared-error \\x{t) — for all smooth x{t), and therefore discussed a minimax method 

instead. In contrast, in the stochastic setting we can use the PSD A^xi^^) of x(t) in order to minimize the MSE 
E[\x{t) — for every t, which depends only on the statistics of x{t) and not on the signal itself. 

Our approach is to minimize the MSE by linear processing of the samples c[n\. As opposed to the common 
Wiener filtering problem, where both the input and output are either continuous- or discrete-time signals, here we 
are interested in estimating the continuous-time signal x{t) based on equidistant samples of y{t) = x{t) * s{—t). 
Consequently, we refer to this as the hybrid Wiener filtering problem. 
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The reconstruction x{t) minimizing the MSE can be implemented by the block diagram in Fig. [3] with the 
interpolation kernel 121, ISSl, 131, l57]| 

W{lo) = S{lo)A,,{lo), (41) 

and digital correction filter 

H(e^'^) = - (42) 

Ek=-oo\S{io-27Tk)\^A,,{u;-27Tk)- ^ ' 

It is interesting to observe that (|4TI ) and (l42l ) are identical to (1281 ) and ( |29l ) with K^xi'^) = \L{i^)\~'^- Therefore, the 
smoothness operator in the deterministic case corresponds to the whitening filter of the input x{t) in the stochastic 
setting. 



B. Constrained Reconstruction 

We now treat a more practical constrained setting, in which the interpolation filter is fixed in advance. Unfortu- 
nately, in this case, for a general given interpolation kernel, there is no digital correction filter that can minimize the 
MSE for every t l50l . In fact, the filter minimizing the MSE at a certain time instant to also minimizes the MSE at 
times {Lq + n] for all integer n, but not over the whole continuum. Therefore, error measures other than pointwise 
MSE must be considered. Before treating the problem of choosing an appropriate criterion, we first discuss how 
this time dependence phenomenon is related to artifacts commonly encountered in certain interpolation methods. 

The signal x{t) in our setup is assumed to be WSS and, consequently, the sequence of samples c[n] is a discrete 
WSS random process, as is the output d[n] of the digital correction filter in Fig. |4l The reconstruction x{t) is 
formed by modulating the shifts of the kernel w{t) by the WSS discrete-time process d[n]. Assuming that the 
PSD of d[n\ is positive everywhere, signals of this type are not stationary unless w{t) is vr-bandlimited l48l . 
Generally, x{t) will be a cyclostationary process. In practice, the interpolation kernels in use have a finite (and 
usually small) support, and are therefore not bandlimited. In these cases, the periodic correlation in x{t) often 
degrades the quality of the reconstruction, as subjectively perceived by the visual or auditory system. 

Note that although natural signals are rarely stationary to begin with, it is still relevant to study how an 
interpolation algorithm reacts to stationary signals. In fact, if an interpolation scheme outputs a cyclostationary 
signal when fed with a stationary input, then it will commonly produce reconstructions with degraded subjective 
quality also when applied to real world signals, as demonstrated in Fig. [I4l However, periodic structure in a 
recovered signal is not necessarily related to MSE. For example, the optimal unrestricted kernel (|4TI ) is usually 
not bandlimited and thus leads to periodic structure in x{t). 

The nonstationary behavior of x{t) is the reason why the pointwise MSE can not be minimized for every t in 
general. Two alternative error measures that have been proposed are the sampling-period-average-MSE and the 
projected MSE. 

The sampling-period-average-MSE utilizes the periodicity of the MSE, and integrates it over one period l58l. 
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(b) Rectangular kernel. 




(c) Bicubic kernel. 




(a) Original low resolution image. 



(d) Sine kernel. 

Fig. 14: Periodic structure in an interpolated signal is a phenomena related to the effective bandwidth of the 
interpolation kernel. The larger the portion of its energy outside [— 7r,7r], the stronger the periodic correlation. The 
three images on the right were obtained by scaling a patch of the original image by a factor of 5 using three 
different methods. The portion of energy in the range [— 7r,7r] of the kernels is: (b) rectangular kernel - 61%; 
(c) bicubic kernel - 91%; (d) sine - 100%. Suppressed periodic correlation, however, does not necessarily imply 
that the reconstruction error is small. 



MSEa = E 



io+l 



\x{t) - x{t)\'^dt 



(43) 



It turns out that minimization of the average MSB leads to a correction filter independent of Iq |48 |. The second 
approach makes use of the fact that the best possible approximation to x{t) in W is Py^x{t). Therefore this method 
arms at minimizing the projected MSB, defined as the MSB with respect to the optimal approximation in W [|50il : 



MSBp = E [\Pwx{t) - x{t)f 



(44) 



Interestingly, both error measures (|43] ) and (l44l) lead to the same digital correction filter, which is given by t48t . 



m 



^sw (e^") ^ww (e^") ' 



(45) 



where here W{uj) = S{lu)Axx{uj). This is also the solution obtained by the minimax regret criterion (see ( [37l )) 
where \L{uj)\^'^ replaces the spectrum Axxi^^)- Therefore, here again, L{io) plays the role of the whitening filter 
of x{t). 
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The average MSE criterion (l43l) can also be used to handle the dense grid recovery setup of Fig. IH Minimiza- 
tion of the average MSE, in this case, leads to the corresponding minimax regret solution (|39l ) with W{uj) = 

The mathematical equivalence between the minimax and Wiener formulas suggests selecting an "optimal" 
operator L{uj) in the minimax formulation that "whitens" the signal. In practice, one can either choose L{lo) in 
advance to approximately whiten signals typically encountered in a specific application (e.g., magnetic resonance 
images), or specify a parametric form for L{uj) and optimize the parameters based on the samples c[n\ f32ll . 

V. Sparsity Priors 

Before concluding this review, we briefly address sampling of sparse analog signals, a topic that has gained 
considerable interest in recent years. 

Throughout the review we focused mainly on linear interpolation techniques which were sufficient to recover, 
or properly approximate, many classes of signals. An important case where nonlinear methods are necessary is 
when the prior on the signal is not a subspace but rather a sparsity constraint. For example, we may know that 
the signal has the form 

N-l 

x{t) = akg{t - tk) (46) 
fc=i 

for some coefficients and time instants t^- Such a signal is said to have a sparse representation since only a few 
parameters are needed to specify it ll22l . ll23l . If the values tk are known then this is just a subspace prior. More 
interesting is when the times need to be estimated along with the coefficients a^. Several sampling strategies to 
deal with these signal classes have been suggested, known as finite rate of innovation sampling. It turns out that 
roughly 2N samples are enough to recover the entire signal with proper post processing. 

Another important class of sparse signals is the class of signals whose frequency transform (or any other 
transform) has a multiband structure. In this case the Fourier transform consists of a finite number of bands, 
each of width at most B, as illustrated in Fig. \T5\ If the band locations Oj, 6j are known, then this corresponds to 



B 




T 

xij) 

Fig. 15: Typical spectrum support of a multiband signal. 

a subspace prior and the methods discussed in this review can be used to recover the signal from its samples |[59l , 
||60l. ll6ll . A very interesting question is whether the signal can be recovered at a rate lower than the Nyquist rate, 
1/r in the figure, when the band locations are unknown. Such a sampling scheme is referred to as blind since it 
does not require knowledge of the band locations in the sampling and reconstruction stages. 
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At first, it may seem that since the band locations are unknown, the signal can have energy in the entire 
Nyquist frequency range, and therefore lower than Nyquist sampling will not be sufficient to recover the signal. 
Surprisingly, this reasoning is in fact incorrect. In practice such classes of signals can be sampled at rates much 
lower than Nyquist, without impairing the ability to perfectly recover the signals lITSl . The tradeoff is in that the 
reconstruction involves nonlinear processing of the samples. In fact, it can be proven that the minimal sampling 
rate at which such signals can be processed is twice the sum of the widths of the frequency bands, which can be 
significantly smaller than twice the highest frequency, corresponding to the Nyquist rate. When the band locations 
are known, the minimal sampling rate is exactly the sum of the bands, referred to as the Landau rate. Thus, the 
price for constructing an ideal blind system is only a factor of two. (In the presence of noise and other distortions, 
a larger factor will be necessary to ensure stable recovery.) 

The techniques developed to sample and reconstruct such classes of signals are based on ideas and results from 
the emerging field of compressed sensing |[T6]| . ifTTl . However, while the latter deals with sampling of finite vectors, 
the multiband problem is concerned with analog sampling. By using several tools, developed in more detail in 
im, |[20l . ||211 . it is possible to extend the essential ideas of compressed sensing to the analog domain. In this 
setting, the measurement matrix of traditional compressed sensing is replaced by a bank of analog filters. This 
broader framework combines ideas presented in this review with traditional compressed sensing tools in order to 
treat more general classes of signals, such as signals that lie in a union of SI spaces. In this case, 

K oo 

^(*)=Z] dk[n]ak{t-n), (47) 

k=l n=— oo 

for a set of generators ak{t) where only M < K out of the sequences d[n] are not identically zero. This model 
subsumes the bandlimited class of functions as a special case. These results can also be applied more generally 
to signals that lie in a union of subspaces ll62l . |[63l . which are not necessarily shift invariant. 

VI. Unified View and Practical Considerations 

In this article, we reviewed a panorama of methods for recovering a signal from its samples. The solutions 
emerged from different assumptions on the underlying signal, the sampling process and the reconstruction mech- 
anism. Whereas it is generally well understood how to model the sampling process in real-world applications, 
choosing the signal prior and the reconstruction scheme is typically left to the practitioner. These components 
affect both the performance and the computational load of the resulting algorithm. 

Below, we emphasize commonalities and equivalence between the different methods in order to help the 
practitioner design the most appropriate filter for a particular application. We focus on the linear sampling scheme 
of Fig. [T] and on the reconstruction method of Fig. [3l in which the sampling and interpolation grids coincide. 
Similar considerations can be applied in the more general scenarios we surveyed as well. 

The linear recovery algorithms corresponding to the first two columns of Table |lll] share a common structure. 
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x{t) WSS witii PSD Axx{oj) 


Prior Filter 
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S{co)/\L{u;W 





The digital correction filter H{e^'^) of Fig. [3] can be written in all cases in the form 

H (e^-) = '^^^ (48) 

where (pspie^'^) is defined by dTOt . Here S{uj) and W{uj) are the CTFTs of the sampling and reconstruction filters, 
and P{io), referred to as the prior filter, shapes the spectrum of x{t) according to the prior. The different priors 
together with the corresponding filters are summarized in Table |IVl 

The reconstruction filter W{uj) can either be chosen in advance to lead to efficient implementation, or optimized 
according to the prior. The solutions in the unconstrained setting can be recovered from (1481) with W{lo) = P{io)- 
Indeed, in this case, the filter of (1481 ) reduces to 

H(e^'^) = - — (49) 

Substituting the values of the prior filter P{e^^) into (|49l ) according to Table |IV] leads to the first column of 
Table In] This filter also guarantees perfect reconstruction for any signal lying in the SI space spanned by the 
functions {p{t — n)}, offering an additional viewpoint on the prior filter P{uj): It defines the SI space for which 
perfect reconstruction is obtained using ( |49l ). 

When the reconstruction filter W{uj) is fixed in advance, substituting the values of P{uj) from Table HVl into 
(|48] ) results in the second column of Table JIIJ with the minimax solution in the case of a smoothness prior. The 
consistent solution follows from choosing P{lu) = W{lo). 

In general, practical evaluation of ( |48l ) may be difficult. One brute-force technique is to truncate the infinite 
series in (ITOl ) required for the computation of (psp{e^'^), 4)wp{^'^'^) and (pyywi^'^'^)- Any filter design technique 
can then be used to approximate this desired response with an FIR or IIR filter. There are however important 
cases in which H{e^'^) can be determined analytically. An important example is when s{t), w{t) and p{t) are all 
B -splines ll64l . |[T2 I. The numerator in (1481 ) then corresponds to an FIR filter with a simple formula. Each of the 
terms in the denominator correspond to a concatenation of a causal and anti-causal IIR filter. We may therefore 
first filter the data with a recursive formula running from left to right and then filter the result with the same 
formula running from right to left ll64l . 

The unified interpretation of the different interpolation algorithm highlights the fact that choosing a specific 
method is equivalent to choosing the prior filter. This filter, in turn, should be matched to the typical frequency 
content of the input signals. In addition, we have also seen that a general purpose recovery algorithm (i.e., one 
which can handle resampling at arbitrary points) requires an explicit expression for w{t) in the time domain. This 
should be taken into consideration when choosing the prior in the unconstrained approach since w{t) = p{t) in 
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this case. Therefore, a kernel p{t) with an analytic formula is beneficial. 

Various priors have been previously proposed in the literature. A common choice corresponds to selecting 
L{u!) = ao + aijio + a2{jio)'^ + • • • + axij^)^ as a differential operator together with S{lo) chosen as a rational 
causal filter |[50l . Another type of prior is associated with a family of WSS processes including the Matern class 
ll33l . The regularization filter in this case is given by L{ijj) = Y[m=ii'^i'n + where a; denotes the 2D 

frequency vector, > and 7^ > 1- The resulting kernel p{t) was shown to have a closed form in the case of 
pointwise sampling (i.e., S{u>) = 1) [65j. A non-isotropic version of this kernel was also used in Oil , where the 
authors optimized the model parameters based on the samples c[n]. 

Finally, we comment briefly on the reconstruction filter w{t). The key consideration in choosing w{t) is its 
support, which determines the number of coefficients of the corrected sequence d[n\ participating in computing 
^(^0) = J2 d[n\'w{tQ — n). Typically, kernels with support up to 4 are used, requiring 4 multiplications per time 
instant to compute x(to) (or 16 in two dimensions). These include B-splines of degree to 3 whose supports 
are 1 to 4 respectively, the Keys cubic interpolation kernel |[66l whose support is 4, and the Lanczos kernel with 
support 4. Some of the commonly used kernels, such as Keys and Lanczos, possess the interpolation property, 
namely w{n) = 5[ri\. This implies that if we are given pointwise samples of the signal c[n] = x{n), then no 
correction filter h[n] is needed in order to obtain a consistent reconstruction satisfying x{n) = c[n]. 

Appendix A 
Box[Aj Basis Expansions 

A Schauder basis for a complex Hilbert space is a countable set of vectors {xn} in Ti such that every vector 
X can be written uniquely as a series 

X = c[n]xn (50) 

n=— 00 

with scalars c[n]. For example, the set of complex exponentials Xn{t) = ex.p{jujnt} defined over t e [— 7r,7r] is 
a Schauder basis for the space L2[— vr, vr] of square integrable functions over [— vr, vr]. In this basis, the expansion 
coefficients c[n\ of a function x{t) are its Fourier coefficients. 

A countable set of vectors {x„} in 7i is a Riesz basis for H if it is complete and there exist two constants 
a > and /? < 00 such that 

2 



00 

a 

n=— 00 



< 



P E I^MP, Vcg£2. (51) 



Here ||y|| is the norm over H. Riesz bases have the desired stability property, namely that a slight change in the 
expansion coefficients c[n\ is ensured to entail only a small change in x. Consequently, these bases are important 
in ensuring stable sampling schemes. 

An important question is how to obtain the expansion coefficients c[n] of a vector x. If the basis is 
orthonormal (i.e., {xm, Xn) = &mn where (x, y) is the inner product over Ti) then c[n] = (x, x^). This follows from 
taking the inner products of both sides of (|50l ) with x^ and exploiting the orthogonality property. To determine the 
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expansion coefficients when using a general non-orthogonal basis, we follow a similar route using the biorthogonal 
vectors, or dual basis x„. The dual basis of x„ is the unique basis of H that satisfies the property 

{XfYii^n) — ^mn- (52) 

Taking the inner products of both sides of dSOl ) with respect to Xm, we find that 

c[n] = {x,Xn). (53) 

If Riesz basis, then so is its biorthogonal basis. 

When the set of vectors span only a subspace U of H, there may be many choices of biorthogonal bases 
in H satisfying (l52l ). Intuitively, the biorthogonal basis vectors should span a subspace with the same number of 
degrees of freedom as U. A formal statement of this observation is that given any subspace V satisfying the direct 
sum condition TC = U ® V"*", there exists a unique set of vectors {xn} lying in V which constitute a biorthogonal 
basis for {xn}. This set is called the oblique dual basis of {x„} in V El, ||T5l, US, El, Ell, ED. The vectors 
{xn} satisfy ( [52l ) and form a basis for V, that obey the Riesz condition given that is a Riesz basis. In each 
subspace V there is only one dual basis. The canonical dual basis refers to the choice U = V. This concept can 
also be extended to sets of vectors that are linearly dependent, leading to oblique dual frames. 

Appendix B 

BoxEJ Discrete and Continuous Fourier Transforms 
The continuous-time Fourier transform (CTFT) of a signal x{t) in L2 is defined as 

POO 

X{uj) = / x{t)e'^'^^dt. (54) 



We use the convention that upper case letters denote Fourier transforms. The discrete-time Fourier transform 
(DTFT) of a sequence x[n\ in £2 is defined by 

00 

X{e^'^)= J2 x[n]e-^'^''. (55) 

n=— 00 

The DTFT is 27r -periodic; to emphasize this fact we use the notation X{e^'^). 
The DTFT of the sampled sequence y{t = n) is related to the CTFT of y{t) by 

00 

y(eJ'^) = ^ Y{uj- 2TTk). (56) 

A:=— 00 

In the reverse direction, if the sequence d[n] is used to create a continuous-time signal f{t) = — n), 

then 

F{lo) = D{e^'^)Y{uj). (57) 

An important sequence encountered in signal recovery problems is the sampled cross correlation ras[n] = 
{a{t),s{t — n)). This sequence can be obtained by sampling the output of the filter s{—t) with a{t) as its input. 
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Pvx V Evwx V 

(a) Orthogonal projection. (b) Oblique projec- 
tion. 

Fig. 16: Decomposition of a vector x into two components using an orthogonal projection and an oblique projection. 
An immediate consequence from (l56l ) is that the DTFT of Vas [n] can be expressed as 

oo 

05^(eJ^) = S*{uj -2TTk)A{uj -2TTk), (58) 

fc=— oo 

where (•)* denotes the complex conjugate. 

The set {a{t — n)} is orthonormal if each function a{t — n) is orthonormal to all of its integer shifts. This is 
equivalent to requiring that raa[n\ = 5[n] where 

f 1 n = 0; 

5[n] = I (59) 
I otherwise. 

From (|58] ) we conclude that {a{t — n} is an orthonormal sequence if and only if 

oo 

(t>aa{en= K^-2vrA:)|2 = l. (60) 

fc=— oo 

Appendix C 
Box[Cj Projections in Hilbert Spaces 

A projection S in a Hilbert space H is a linear operator from Ti onto itself that satisfies the property 

E'^ = E. (61) 

The importance of projection operators is that they map the entire space H. onto the range space 7i{E), and leave 
vectors in this subspace unchanged. Furthermore, property (|6TI ) implies that every vector in H. can be uniquely 
written as the combination of a vector in TZ{E) and a vector in the null space M{E), that is, we have the direct sum 
decomposition H = TZ{E) ®J\f{E). This is illustrated in Fig. [T6]for two different projection operators. Therefore, 
a projection is completely determined by its range space and null space. 

An orthogonal projection P is a Hermitian projection operator. In this case the range space '1Z{P) and null space 
M{P) are orthogonal. Therefore, an orthogonal projection is completely determined by its range space. We use 
the notation Py to denote an orthogonal projection with range V = TZ{P\)). 

An oblique projection -Evw is an operator satisfying the projection property ( [6T] ) that is not Hermitian. Its range 
space is given by V so that E\;y^x = x for any x £ V, and its null space is given by W so that E\;y^x = for 
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any x e W. 

When decomposing the space using an orthogonal projection, the vectors comprising the decomposition are 
orthogonal, since 7^(P) and M{P) are orthogonal spaces. This is not true when using an oblique projection, as 
illustrated in Fig. [161 Another important feature of the orthogonal projection is that the norm of the projection is 
never larger than the original norm: 

||Pv2;|| < 11x11. (62) 

This inequality does not necessarily hold for an oblique projection. In fact, the norm of the oblique projection 
can be much larger than the signal norm. Consequently, algorithms relying on the oblique projection can cause a 
significant increase in the noise if it is not constrained to the range space of the projection. On the other hand, 
orthogonal projections are more stable in the presence of noise due to (l62l ). 
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